75 research outputs found

    SegNetr: Rethinking the local-global interactions and skip connections in U-shaped networks

    Recently, U-shaped networks have dominated the field of medical image segmentation due to their simple and easily tuned structure. However, existing U-shaped segmentation networks: 1) mostly focus on designing complex self-attention modules to compensate for the lack of long-term dependence based on convolution operation, which increases the overall number of parameters and computational complexity of the network; 2) simply fuse the features of encoder and decoder, ignoring the connection between their spatial locations. In this paper, we rethink the above problem and build a lightweight medical image segmentation network, called SegNetr. Specifically, we introduce a novel SegNetr block that can perform local-global interactions dynamically at any stage and with only linear complexity. At the same time, we design a general information retention skip connection (IRSC) to preserve the spatial location information of encoder features and achieve accurate fusion with the decoder features. We validate the effectiveness of SegNetr on four mainstream medical image segmentation datasets, with 59\% and 76\% fewer parameters and GFLOPs than vanilla U-Net, while achieving segmentation performance comparable to state-of-the-art methods. Notably, the components proposed in this paper can be applied to other U-shaped networks to improve their segmentation performance

    DiffSeer: Difference-based Dynamic Weighted Graph Visualization

    Existing dynamic weighted graph visualization approaches rely on users' mental comparison to perceive temporal evolution of dynamic weighted graphs, hindering users from effectively analyzing changes across multiple timeslices. We propose DiffSeer, a novel approach for dynamic weighted graph visualization by explicitly visualizing the differences of graph structures (e.g., edge weight differences) between adjacent timeslices. Specifically, we present a novel nested matrix design that overviews the graph structure differences over a time period as well as shows graph structure details in the timeslices of user interest. By collectively considering the overall temporal evolution and structure details in each timeslice, an optimization-based node reordering strategy is developed to group nodes with similar evolution patterns and highlight interesting graph structure details in each timeslice. We conducted two case studies on real-world graph datasets and in-depth interviews with 12 target users to evaluate DiffSeer. The results demonstrate its effectiveness in visualizing dynamic weighted graphs

    Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP

    In the realm of expressive Text-to-Speech (TTS), explicit prosodic boundaries significantly advance the naturalness and controllability of synthesized speech. While human prosody annotation contributes a lot to the performance, it is a labor-intensive and time-consuming process, often resulting in inconsistent outcomes. Despite the availability of extensive supervised data, the current benchmark model still faces performance setbacks. To address this issue, a novel two-stage automatic annotation pipeline is proposed in this paper. Specifically, in the first stage, we propose contrastive text-speech pretraining of Speech-Silence and Word-Punctuation (SSWP) pairs. The pretraining procedure aims at enhancing the prosodic space extracted from the joint text-speech space. In the second stage, we build a multi-modal prosody annotator, which consists of pretrained encoders, a straightforward yet effective text-speech feature fusion scheme, and a sequence classifier. Extensive experiments conclusively demonstrate that our proposed method excels at automatically generating prosody annotation and achieves state-of-the-art (SOTA) performance. Furthermore, our novel model has exhibited remarkable resilience when tested with varying amounts of data. Comment: Submitted to ICASSP 202

    TranssionADD: A multi-frame reinforcement based sequence tagging model for audio deepfake detection

    Thanks to recent advancements in end-to-end speech modeling technology, it has become increasingly feasible to imitate and clone a user's voice. This leads to a significant challenge in differentiating between authentic and fabricated audio segments. To address the issue of user voice abuse and misuse, the second Audio Deepfake Detection Challenge (ADD 2023) aims to detect and analyze deepfake speech utterances. Specifically, Track 2, named the Manipulation Region Location (RL), aims to pinpoint the location of manipulated regions in audio, which can be present in both real and generated audio segments. We propose our novel TranssionADD system as a solution to the challenging problem of model robustness and audio segment outliers in the trace competition. Our system provides three unique contributions: 1) we adapt the sequence tagging task for audio deepfake detection; 2) we improve model generalization by various data augmentation techniques; 3) we incorporate a multi-frame detection (MFD) module to overcome the limited representation provided by a single frame and use an isolated-frame penalty (IFP) loss to handle outliers in segments. Our best submission achieved 2nd place in Track 2, demonstrating the effectiveness and robustness of our proposed system.

    Loop closure detection of visual SLAM based on variational autoencoder

    Loop closure detection is an important module for simultaneous localization and mapping (SLAM). Correct detection of loops can reduce the cumulative drift in positioning. Because traditional detection methods rely on handcrafted features, false positive detections can occur when the environment changes, resulting in incorrect estimates and an inability to obtain accurate maps. In this research paper, a loop closure detection method based on a variational autoencoder (VAE) is proposed. It is intended to be used as a feature extractor to extract image features through neural networks to replace the handcrafted features used in traditional methods. This method extracts a low-dimensional vector as the representation of the image. At the same time, the attention mechanism is added to the network and constraints are added to improve the loss function for better image representation. In the back-end feature matching process, geometric checking is used to filter out wrong matches that cause false positives. Finally, through numerical experiments, the proposed method is demonstrated to have a better precision-recall curve than the traditional bag-of-words method and other deep learning methods and is highly robust to environmental changes. In addition, experiments on datasets from three different scenarios also demonstrate that the method can be applied in real-world scenarios and that it performs well.

    Total saponins from Trillium tschonoskii Maxim promote neurological recovery in model rats with post-stroke cognitive impairment

    Total saponins from Trillium tschonoskii Maxim (TSTT), a bioactive component of local natural herbs in the Enshi area, China, have been demonstrated to have functions of restoring cognitive capacity and promoting axonal regeneration post-stroke, but the mechanism of this process remains unclear. The hippocampus is a critical tissue for controlling learning and memory capacity, and the sonic hedgehog (Shh) signaling pathway plays a major role in the patterning and synaptic plasticity of hippocampal neural circuits. Therefore, we aimed to investigate whether TSTT could restore learning and cognitive functions by modulating the Shh pathway in rats with post-stroke cognitive impairment (PSCI). The ischemia model was established by permanent middle cerebral artery occlusion (MCAO) in 100 Sprague–Dawley (SD) rats, and the model rats were administered TSTT (100 mg/kg) or donepezil hydrochloride as the positive control (daily 0.45 mg/kg, DON) for 4 weeks after the operation. As assessed by the Morris water maze test, the cognitive function of PSCI rats was significantly improved upon TSTT treatment. Meanwhile, the cerebral infarct volume was reduced with TSTT, as shown by HE and TTC staining, and the number of Nissl bodies and dendritic spine density were significantly increased, as shown by Nissl and Golgi staining. In addition, TSTT upregulated PSD-95, SYN, and GAP-43, and inhibited neuronal apoptosis, as evidenced by increased Bcl-2 levels along with decreased Bax and caspase-3 expression. TSTT could also significantly upregulate Shh, Ptch1, Smo, and Gli1 proteins, indicating the activation of the Shh signaling pathway. Therefore, TSTT can protect PSCI rats by inhibiting apoptosis and promoting neuronal synaptic remodeling. The Shh pathway is also involved.

    Extremely thin perfect absorber by generalized multipole bianisotropic effect

    Symmetry breaking plays a crucial role in understanding the fundamental physics underlying numerous physical phenomena, including the electromagnetic response in resonators, giving rise to intriguing effects such as directional light scattering, supercavity lasing, and topologically protected states. In this work, we demonstrate that adding a small fraction of lossy metal (as low as $1\times10^{-6}$ in volume), to a lossless dielectric resonator breaks inversion symmetry thereby lifting its degeneracy, leading to a strong bianisotropic response. In the case of the metasurface composed of such resonators, this effect leads to unidirectional perfect absorption while maintaining nearly perfect reflection from the opposite direction. We have developed more general Onsager-Casimir relations for the polarizabilities of particle arrays, taking into account the contributions of quadrupoles, which shows that bianisotropy is not solely due to dipoles, but also involves high-order multipoles. Our experimental validation demonstrates an extremely thin terahertz-perfect absorber with a wavelength-to-thickness ratio of up to 25,000, where the material thickness is only 2% of the theoretical minimum thickness dictated by the fundamental limit. Our findings have significant implications for a variety of applications, including energy harvesting, thermal management, single-photon detection, and low-power directional emission.

    An Hα Imaging Survey of the Low-surface-brightness Galaxies Selected from the Fall Sky Region of the 40% ALFALFA H I Survey

    We present the observed Hα flux and derived star formation rates (SFRs) for a fall sample of low-surface-brightness galaxies (LSBGs). The sample is selected from the fall sky region of the 40% ALFALFA H I survey–SDSS DR7 photometric data, and all the Hα images were obtained using the 2.16 m telescope, operated by the National Astronomy Observatories, Chinese Academy of Sciences. A total of 111 LSBGs were observed and Hα flux was measured in 92 of them. Though almost all the LSBGs in our sample are H I-rich, their SFRs, derived from the extinction and filter-transmission-corrected Hα flux, are less than $1\,M_{\odot}\,\mathrm{yr}^{-1}$. LSBGs and star-forming galaxies have similar H I surface densities, but LSBGs have much lower SFRs and SFR surface densities than star-forming galaxies. Our results show that LSBGs deviate from the Kennicutt-Schmidt law significantly, which indicates that they have low star formation efficiency. The SFRs of LSBGs are close to average SFRs in Hubble time and support the previous arguments that most of the LSBGs are stable systems and they tend to seldom contain strong interactions or major mergers during their star formation histories.
